Approximate Similarity Queries: A Survey

Authors

  • Paolo Ciaccia
  • Marco Patella
Abstract

We review the major paradigms for similarity queries, in particular those that allow approximate results. We propose an original classification schema which easily allows existing approaches to be compared along several independent coordinates, such as quality of results, error metrics, and user interaction.

Similar resources

Approximate similarity search: A multi-faceted problem

We review the major paradigms for approximate similarity queries and propose a classification schema that easily allows existing approaches to be compared along several independent coordinates. Then, we discuss the impact that scheduling of index nodes can have on performance and show that, unlike exact similarity queries, no provable optimal scheduling str...

Approximate Queries on Set-valued Attributes

Sets and sequences are commonly used to model complex entities. Attributes containing sets or sequences of elements appear in various application domains, e.g., in telecommunication and retail databases, web server log tools, bioinformatics, etc. However, the support for such attributes is usually limited to definition and storage in relational tables. Contemporary database systems don’t suppor...

Towards Effective Log Summarization

Database access logs are the canonical go-to resource for tasks ranging from performance tuning to security auditing. Unfortunately, they are also large, unwieldy, and it can be difficult for a human analyst to divine the intent behind typical queries in the log. With an eye towards creating tools for ad-hoc exploration of queries by intent, we analyze techniques for clustering queries by inten...

Using the Distance Distribution for Approximate Similarity Queries in High-Dimensional Metric Spaces

We investigate the problem of approximate similarity (nearest neighbor) search in high-dimensional metric spaces, and describe how the distance distribution of the query object can be exploited so as to provide probabilistic guarantees on the quality of the result. This leads to a new paradigm for similarity search, called PAC-NN (probably approximately correct nearest neighbor) queries, aiming...

Approximate Tree Embedding for Querying XML Data

Querying heterogeneous collections of data-centric XML documents requires a combination of database languages and concepts used in information retrieval, in particular similarity search and ranking. In this paper we present an approach to find approximate answers to formal user queries. We reduce the problem of answering queries against XML document collections to the well-known unordered tree ...

Publication date: 2001